Abstract

We study fault-tolerant rumor spreading algorithms in the complete graph topology. Our
focus is on algorithms that use minimum communication both in a global and local sense: 
they establish the minimum possible number of inter-processor connections in total, and in 
each round each processor is involved in at most one connection. The challenge is designing 
such algorithms that have an asymptotically optimal, that is, logarithmic, time complexity 
even in the presence of failed nodes. 

We first show that if nodes are crashed not adversarially, but independently at random
with constant probability less than one, then already the basic GP algorithm of Gąsieniec
and Pelc (Parallel Computing 22:903-912, 1996) with high probability has an asymptotically
optimal $O(\log n)$ time complexity. This improves significantly over the worst-case guarantee
of $f + O(\log n)$ given there for $f$ crashed nodes.

We then show that by adding randomization to the algorithm, these time and communication
complexities can be maintained also against adversarial failures. This is easily
achieved by running the GP algorithm with randomly permuted node labels, at the price,
however, that this permutation (or at least significant parts of it) also has to be disseminated.
To overcome this, we show that the random permutation can be chosen from a set
of only $\omega(n/\log n)$ permutations. Consequently, the permutation can be communicated by
adding $O(\log n)$ bits to each message, which is an overhead produced by many communication
protocols including the GP algorithm. Naturally, this requires all processors to know
this set of permutations, which needs $\omega(n^2)$ space at each processor and some preliminary
communication to set up the system.

Keywords: Rumor Spreading; Message Broadcasting; Randomized Algorithms; Robustness;
Analysis of Algorithms



1 Introduction 

Disseminating information to all nodes of a network is one of the basic communication primitives. 
Basically all collaborative actions in networks imply that some information has to be sent to all 
nodes, and surprisingly complex tasks like computing aggregates can be reduced to essentially
solving a dissemination problem [MAS06].

1.1 Previous Results

In this work, we shall be interested in disseminating a single piece of information to all n nodes 
of a complete communication network (that is, any node can communicate with any other). 

Basically two types of communication protocols have been investigated for such problems, 
(i) deterministic ones aiming for optimal dissemination times and minimal communication effort 
(number of messages sent) and (ii) gossip-based ones building on the paradigm that nodes call 
randomly chosen others. The latter, due to their randomized nature, usually are highly robust 
against all kinds of faults, typically at the price of a higher communication effort and a slightly 
larger runtime. 



1.1.1 Deterministic Protocols 

It is easy to see that there are deterministic protocols disseminating a rumor in $\lceil \log_2 n \rceil$
communication rounds using a total of $n - 1$ messages. A simple protocol for $n = 2^k$ nodes
indexed by the numbers from $0$ to $n - 1$ would be that in round $i$, each node $x$ having the rumor
calls node $x \text{ XOR } 2^{i-1}$ and forwards the message to it. From the sender ID the recipient of a
message can infer the round number $i$, and thus decide when to stop forwarding the message.
Hence this protocol indeed uses only $n - 1$ messages in total. Like the other algorithms
mentioned in this subsection, it also maintains the following appealing whispering property: in
each round, the edges along which the rumor is transferred form a matching.
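
As an illustration, the doubling protocol just described can be simulated in a few lines. This is only a sketch under the assumptions of the text ($n = 2^k$ nodes, no failures, node 0 holds the rumor initially); the function name `xor_broadcast` is ours.

```python
def xor_broadcast(k):
    """Simulate the fault-free doubling protocol on n = 2**k nodes.

    Returns one list of (sender, receiver) calls per round.
    Node 0 is assumed to hold the rumor initially.
    """
    informed = {0}
    rounds = []
    for i in range(1, k + 1):
        # In round i, every informed node x calls node x XOR 2**(i-1).
        calls = [(x, x ^ (1 << (i - 1))) for x in sorted(informed)]
        informed.update(receiver for _, receiver in calls)
        rounds.append(calls)
    return rounds


rounds = xor_broadcast(3)
total_messages = sum(len(calls) for calls in rounds)  # n - 1 messages
```

In every round the senders and receivers are pairwise distinct, so the calls of a round form a matching, which is the whispering property.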

The downside of this simple approach is that it is not at all robust. If a node is not available
("crashed"), then all other nodes that would have been informed via it will remain uninformed.
This problem was overcome in [GP96]. Their basic protocol is optimal if no failures occur (i.e.,
it takes $\lceil \log_2 n \rceil$ communication rounds and $n - 1$ messages); however, when $f$ arbitrary nodes
do not participate in the collaborative process, then, instead of nodes remaining uninformed,
only the runtime increases to at most $f + \lceil \log_2(n - f) \rceil$. The number of messages sent in all
cases is $n - 1$. In this result, it is assumed that a node calling a crashed node learns that its
call was unsuccessful.

While robust in the sense that crashed nodes do not lead to nodes remaining uninformed, the
increase of the runtime bound (equal to basically the number of crashed nodes) is unsatisfactory.
This increase in the runtime is avoided in [DP00], by improving the more advanced algorithm
of [GP96]. They achieve an $O(\log n)$ runtime by adding to the algorithm an opening phase
which takes $O(\log n)$ time and $O(n)$ messages, in which the non-faulty nodes are identified and
organized in a certain subgraph. This opening phase requires the activation of all non-faulty
nodes (including uninformed ones) at the very first round, and it uses an intricate sub-structure
of the network, which is defined by a bound on the number $f$ of faulty processors, and needs to
be computed and stored during preprocessing.

One of the algorithms presented in this paper demonstrates that by introducing a simple
randomization to the basic algorithm of [GP96], it is possible to maintain optimal time (w.h.p.)
and minimal communication (always), without assuming prior knowledge on the number of failures
$f$, without activating uninformed nodes, and by using a considerably simpler preprocessing
stage.



1.1.2 Gossip-based Protocols 

Gossip-based communication protocols build on the paradigm that nodes of a network call
random neighbors and communicate with them. This is also called randomized rumor spreading.
Randomized rumor spreading has been analyzed in various variants for different network
topologies. We briefly note that, despite the very simple approach of talking to random neighbors,
these protocols often achieve a surprisingly good runtime combined with extreme robustness.
We now describe the results relevant for our work and point the reader, by way of example,
to [CHHKM12, DFF11, Gia11, FPRU90] and the references therein for results on other topologies
than the complete one.



The first rumor spreading result is due to Frieze and Grimmett [FG85], who use it to analyze
a shortest path problem. They show that the simple protocol consisting of each informed node in
each round calling a random neighbor (synchronized push-protocol) with high probability (that
is, at least $1 - o(1)$) forwards a rumor from a single node to all $n$ nodes of a complete network
in $\log_2(n) + \ln(n) + o(\log n)$ rounds. This was sharpened to $\log_2(n) + \ln(n) + h(n)$, where $h(n)$
is any function tending to infinity, by Pittel [Pit87]. Note that randomized rumor spreading in
the push-model does not automatically satisfy the whispering property, but without loss can be
made to do so, simply by assuming that nodes accept only one incoming call.

The first to analyze rumor spreading as a communication protocol (namely in the context
of maintaining the consistency of replicated databases) were Demers et al. [DGH+88]. In applications
like this, where one may assume that updates are to be disseminated frequently,
a push-pull randomized rumor spreading protocol also makes sense. Here all nodes (and not
only those already knowing the rumor) call random neighbors, allowing that uninformed nodes
"pull" information from informed ones. Naturally, for the use of pull operations not to generate
an enormous message overhead, one needs the assumption that rumors are injected sufficiently
often.

Another weakness of using randomized pull operations is the violation of the whispering
property, due to the fact that many uninformed processors may randomly select the same
informed one. Specifically, using, e.g., [RS98], it is not hard to see that in a push-pull gossip-based
algorithm, in each round in which the fraction of informed processors is bounded away
from $0$ and from $1$, w.h.p. some informed processor sends the rumor to $\Omega(\ln n / \ln \ln n)$ uninformed
neighbors. For these reasons, we shall regard protocols building only on push operations. The
reader is referred to [KSSV00, AGGZ10] for two beautiful results on push-pull rumor spreading
in complete graphs.

While randomized rumor spreading is always described as highly robust, not too many
proofs of this statement exist. Elsässer and Sauerwald [ES09] prove for general graphs that
messages failing independently with probability $c < 1$ lead to an increase of the runtime of at
most $O(1/(1 - c))$. In [DHL09], this was made more precise for complete network topologies by
showing that the push variant of randomized rumor spreading, with high probability, forwards
a rumor to all nodes in time $\log_{2-c}(n) + \ln(n)/(1 - c) + o(\log n)$, when each call independently
fails with probability $c$. It is not difficult to see that the same bound (for the time needed
to inform the working nodes) holds when $f = cn$ arbitrary nodes are crashed
initially. This shows that randomized rumor spreading is indeed highly robust.

The robustness of randomized rumor spreading comes at a price not only of a slightly higher
dissemination time when no failures occur, but more importantly at a relatively large number
of messages sent until the rumor is disseminated, and at a large number of additional messages
caused by the fact that in the basic protocol the nodes do not know when to stop sending out
messages: In independent randomized rumor spreading in the push-model, only after $O(n \log n)$
messages are sent, the rumor is known to all vertices. By adding suitable dependencies to
the random choice of the communication partner, in [DF11a] a randomized rumor spreading
protocol was designed which in $(1 + o(1)) \log_2(n)$ rounds and with $n h(n)$ messages solves the
dissemination problem (here again $h$ is an arbitrary function tending to infinity). This protocol
is robust against random node crashes (leading to a time bound of $(1 + o(1)) \log_{2-c}(n)$ if $f = cn$
nodes are crashed). The protocol has a simple termination criterion, that is, the bounds on the
number of messages not only refer to the messages sent until all nodes are informed, but in fact
to the total number of messages sent in one run of the protocol. The protocol can be made
robust also against adversarial node crashes, however, at the price of the message complexity
increasing to $O(n \log n)$ when a constant fraction of adversarial node crashes has to be tolerated.



1.2 Our Results 

The previous work described above suggests that gossip-based rumor spreading and its variants
are not well suited to developing dissemination protocols that are efficient in terms of the number
of messages sent. For this reason, we build on the approach taken by the above-mentioned basic
algorithm of Gąsieniec and Pelc [GP96], to be denoted GP.

We first show that for random node crashes, the GP algorithm has a much better performance
than what the worst-case bound in [GP96] states. In particular, when each node is
crashed with constant probability less than one (independently at random), with high probability,
the remaining nodes are informed in time $O(\log n)$.

For adversarial node failures, this implies a straightforward randomized solution: The start
node (i.e., the node at which the rumor starts) picks a random permutation of the other nodes
and initiates the GP protocol with node labels permuted according to this permutation. This
gives the same time bounds as for random node failures. The downside is that to make the
other nodes adopt this strategy, sufficient information on the permutation chosen by the initial
node has to be communicated to the other nodes as well. This can be achieved by adding a
total of $O(n \log^2 n)$ bits to the messages, with at most $n$ bits to each message.

When communication is costly, message sizes can be shortened to $O(\log n)$: We prove that
instead of choosing the permutation randomly from all permutations, it suffices to choose the
permutation randomly from a set of only $\omega(n/\log n)$ permutations (this number can be varied to
adjust runtimes and failure probabilities, see Theorem 14 for the details). This allows encoding
the permutation via only $O(\log n)$ bits (which is also the overhead of the original GP protocol).
This approach can be implemented by computing $\omega(n/\log n)$ random permutations and storing
them at all processors, and repeating this procedure whenever processors join or leave the network.
Thus, this protocol is particularly appealing when communication is expensive, memory
is cheap, and processors are not added or removed from the network too often.

To the best of our knowledge, our two randomized variants of the GP protocol are the only
published rumor spreading protocols which use the minimal possible number of $n - 1$ messages
no matter which nodes are crashed, and achieve the asymptotically optimal dissemination time
of $O(\log n)$. The first one achieves it at the price of adding a total of $O(n \log^2 n)$ bits to the
messages sent and $O(n)$ bits to some messages, while the second requires a memory of size $\omega(n^2)$
bits per processor. We did not try hard to optimize the constants in the runtime bounds, but
$(6 + o(1)) \log_2 n$ follows easily from our proofs.



2 Preliminaries 

Before we present a few basics about rumor spreading protocols, let us briefly fix the notation
used throughout this work. Unless stated otherwise, we consider executions of rumor spreading
algorithms by $n$ processors ordered by their names $(0, 1, \ldots, n - 1)$, where $0$ is the start processor.

We use the following notation: For a sequence $s = (s_1, s_2, \ldots)$, $\mathrm{odd}(s) = (s_1, s_3, \ldots)$ is the
subsequence of the odd-indexed elements of $s$, and $\mathrm{even}(s) = (s_2, s_4, \ldots)$ is the subsequence of
the even-indexed elements of $s$. For a binary vector $b$, $|b|_0$ is the number of zeros in $b$, and $|b|_1$
denotes the number of ones in $b$.

For a rooted tree $T$, $\mathrm{height}(T)$ is the height of $T$, i.e., the maximum length of a path from
the root to a leaf.

For $n \in \mathbb{N}$ ($\mathbb{N}$ denotes the positive integers) we abbreviate $[n] := \{1, 2, \ldots, n\}$. By $S_n$ we
denote the set of all permutations of the set $[n]$.

By $\ln$ we denote the natural logarithm to base $e$. All other logarithms are to base 2.

2.1 Rumor Spreading Protocols 

To ease the comparison of our rumor spreading protocol with previous ones, let us briefly
give a unified description of these. Let the undirected graph $G = (V, E)$ describe the underlying
communication network, that is, nodes of this graph represent processors and a direct
communication between two processors is possible if and only if there is an edge between the
corresponding nodes. Let $n := |V|$.

A (synchronous) execution of a rumor-spreading algorithm on $G$ consists of rounds
$R_1, R_2, \ldots$ A round $R_t$ is initiated by a set of processors $V_t \subseteq V$ (the exact nature of $V_t$
depends on the model assumed and/or on the specific algorithm): each processor $u \in V_t$ sends
a $(u,v)$ communication request (in short, "$(u,v)$ request") to one of its neighbors, $v$; the request
contains a bit informing $v$ whether $u$ holds the rumor already. A $(u,v)$ request is valid if exactly
one of $u, v$ holds the rumor. After receiving all the requests sent to it at $R_t$, each processor $v$
may (but does not have to) approve some of the valid requests that it has received. The round
$R_t$ is then completed by transferring the rumor along the edges of the approved requests.¹ The
execution terminates at time $t$ if $V_t \neq \emptyset$ and $V_{t+1} = \emptyset$. We call $t$ the time (or round) complexity
of the execution of the rumor spreading algorithm. Note that in some other works, in particular
those on randomized rumor spreading, only the first time at which all processors know the
rumor is regarded.

Let $E_t$ denote the set of edges along which requests are sent in $R_t$, and let $F_t$ denote the set of
edges of the approved requests, along which the rumor is transferred at $R_t$ (thus $|E_t| = |V_t| \le n$
and $F_t \subseteq E_t$). A rumor spreading algorithm satisfies the whispering property if $F_t$ always forms
a matching, meaning that each processor may either send or receive at most one copy of the
rumor at each round.

Besides the time complexity, the communication effort and the robustness against faults
are two further important performance measures. There are some variants of the definition
of message complexity of rumor spreading algorithms. The strictest definition counts
all communication requests, i.e., $\sum_t |E_t|$ (e.g., [GP96, DF11b]). A more permissive definition
assumes that communication between uninformed processors is given for free due
to frequent injections of other rumors ([KSSV00, CHHKM12]), and hence it reduces to
$\sum_t |\{(u,v) : (u,v) \in E_t \text{ and either } u \text{ or } v \text{ holds the rumor}\}|$. As will be noted soon, our algorithms
have the minimum possible message complexity by both definitions.

The faults assumed in this paper are initial node failures: A processor is faulty in a given
execution if it never sends a message during the execution. We consider two types of failure
policies, associated with a success parameter $p \in (0, 1)$: random failures, in which each processor
may fail independently with probability $1 - p$, and adversarial failures, in which the adversary
may fail (before the execution of the algorithm starts) any subset of up to $(1 - p)n$ processors,
excluding the start processor. An $(i,j)$ request is failed if $j$ is faulty, and it is successful otherwise.
Note that in our synchronized model, a faulty node $j$ is identified by not responding to a
request.

¹ Some models assume that an informed processor $u$ always sends the rumor on the selected edge $(u,v)$, even
if $v$ is already informed.

2.2 The Algorithm of Gąsieniec and Pelc

We use the following variant of the divide-and-conquer algorithm of Gąsieniec and Pelc [GP96],
to be denoted GP. Initially the start processor holds a list $(1, 2, \ldots, n - 1)$ of all uninformed
processors, and all other processors hold empty lists. At each round, each processor $i$ which
holds a nonempty list $(j_1, \ldots, j_k)$ sends an $(i, j_1)$ request and deletes $j_1$ from its list. If the
request is successful then $i$ also sends to $j_1$ the rumor, appends to it the list
$\mathrm{even}(j_2, \ldots, j_k) = (j_3, j_5, \ldots)$, and sets its own list to $\mathrm{odd}(j_2, \ldots, j_k)$. Thus, in this case, the next round starts
with $i$ holding the list $\mathrm{odd}(j_2, \ldots, j_k)$ and processor $j_1$ holding the list $\mathrm{even}(j_2, \ldots, j_k)$. The
algorithm terminates when all processors hold empty lists.
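
The list handling described above can be made concrete by a small round-based simulation. The following is a minimal sketch, not the authors' implementation: the function name `gp_broadcast` and the representation of the crashed processors as a set are our assumptions, and the start processor 0 is assumed to be non-faulty.

```python
def gp_broadcast(n, crashed=frozenset()):
    """Simulate the GP variant on processors 0..n-1 with initial crashes.

    `crashed` is the set of faulty processors (processor 0 is assumed
    to be non-faulty). Returns (number of rounds, number of requests).
    """
    todo = {0: list(range(1, n))}      # informed processor -> its list
    rounds = requests = 0
    while any(todo.values()):          # some processor holds a nonempty list
        next_todo = {}
        for i, lst in todo.items():
            if not lst:
                next_todo[i] = []
                continue
            j, rest = lst[0], lst[1:]
            requests += 1
            if j in crashed:
                # Failed request: i only deletes j from its list.
                next_todo[i] = rest
            else:
                # Successful request: i keeps odd(rest), j gets even(rest).
                next_todo[i] = rest[0::2]
                next_todo[j] = rest[1::2]
        todo = next_todo
        rounds += 1
    return rounds, requests
```

For example, `gp_broadcast(8)` returns `(3, 7)`, and crashing processors 1, 2 and 3 in the same network, `gp_broadcast(8, {1, 2, 3})`, returns `(6, 7)`: three extra rounds, but still $n - 1$ requests.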

Implementation note: Observe that each list of the form $\mathrm{even}(j_2, \ldots, j_k)$ generated during
the algorithm is an arithmetic progression whose difference is $2^m$ for some integer $m < \log n$.
Sending such a list can be done by sending the first element $j_3$, the length of the list $\lfloor (k-1)/2 \rfloor$, and
the exponent $m$. Overall, this requires an addition of less than $3 \log n$ bits to the rumor.
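
The compact encoding from the implementation note can be sketched as follows. The helper names `encode_progression` and `decode_progression` are hypothetical, and the triple layout (first element, length, exponent) is one possible choice.

```python
def encode_progression(lst):
    """Encode an arithmetic progression with difference 2**m as (first, length, m)."""
    if not lst:
        return (0, 0, 0)
    if len(lst) == 1:
        return (lst[0], 1, 0)
    diff = lst[1] - lst[0]
    m = diff.bit_length() - 1
    assert diff == 1 << m, "difference must be a power of two"
    return (lst[0], len(lst), m)


def decode_progression(first, length, m):
    """Rebuild the list from its (first, length, m) encoding."""
    return [first + t * (1 << m) for t in range(length)]
```

Each of the three numbers is smaller than $n$, which is where the bound of less than $3 \log n$ additional bits comes from.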

Note that this protocol automatically ensures that (i) each node receives at most one communication
request per round (hence the whispering property is satisfied), (ii) only requests
from informed nodes to uninformed ones are issued (hence there is no reason not to approve a
request), and (iii) the protocol terminates as soon as all processors know the rumor.

The optimality of the message complexity of the GP algorithm (under the different variants
of "message complexity" discussed in Section 2.1) is implied by the following straightforward
observation.

Lemma 1 ([GP96]). The GP algorithm performs the minimum possible number of communication
requests, namely $n - 1$ communication requests in each possible execution.

Lemma 2 ([GP96]). In the presence of $f$ initial node failures, the time complexity of the GP
algorithm is at most $f + \lceil \log(n - f) \rceil$. This bound is tight if processors $1, \ldots, f$ are failed.

2.3 Reminder: Chernoff's Bounds

We apply several versions of Chernoff's bound. The following can be found, for example,
in [DP09].

Theorem 3 (Chernoff's bounds). Let $X = \sum_{i=1}^{n} X_i$ be the sum of $n$ independently distributed
random variables $X_i$, where each variable $X_i$ takes values in $[0, 1]$. Then the following statements
hold.

$$\forall t > 0: \quad \Pr[X > \mathrm{E}[X] + t] \le \exp(-2t^2/n) \quad \text{and} \quad \Pr[X < \mathrm{E}[X] - t] \le \exp(-2t^2/n). \tag{1}$$

$$\forall \varepsilon > 0: \quad \Pr[X < (1 - \varepsilon)\,\mathrm{E}[X]] \le \exp(-\varepsilon^2\,\mathrm{E}[X]/2) \quad \text{and} \quad \Pr[X > (1 + \varepsilon)\,\mathrm{E}[X]] \le \exp(-\varepsilon^2\,\mathrm{E}[X]/3). \tag{2}$$

$$\forall t > 2e\,\mathrm{E}[X]: \quad \Pr[X > t] \le 2^{-t}. \tag{3}$$

Chernoff's bound applies also to geometric random variables. A proof of the following
theorem can be found, e.g., in [AD11, Theorem 1.14].

Theorem 4 (Chernoff bound for geometric random variables). Let $p \in (0, 1)$. Let $X_1, \ldots, X_n$
be independent geometric random variables with $\Pr[X_i = k] = (1 - p)^{k-1} p$ for all $k \in \mathbb{N}$. Let
$X := \sum_{i=1}^{n} X_i$. Then for all $\delta > 0$,

$$\Pr[X \ge (1 + \delta)\,\mathrm{E}[X]] \le \exp\left(-\frac{\delta^2}{2} \cdot \frac{n - 1}{1 + \delta}\right).$$

3 Random Failure Analysis of the GP Algorithm via a New Random Wakeup Model



In this section we show that the GP algorithm has a much better performance against random
node failures than the worst-case performance given in Lemma 2 against adversarial node
failures. We assume that each processor may fail with probability $1 - p$ independently. It is not
hard to see that the expected runtime is bounded from below by the solution to the recursive
formula $F(1) = 0$; $F(n) = p \cdot F(n/2) + (1 - p) \cdot F(n - 1) + 1$, which is $\log(n)/p + O(1)$. On the other
hand, we show that every processor is informed after $O(\log(n)/p)$ rounds, with high probability.
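
To get a feeling for the recursion $F(n) = p \cdot F(n/2) + (1 - p) \cdot F(n - 1) + 1$, it can be evaluated numerically. The sketch below is only illustrative; rounding $n/2$ up is our assumption (the text leaves the rounding implicit), and the function name is ours.

```python
def expected_time_recursion(n, p):
    """Evaluate F(1) = 0, F(m) = p*F(ceil(m/2)) + (1-p)*F(m-1) + 1 for m = 2..n.

    Rounding m/2 up is an assumption of this sketch.
    """
    F = [0.0] * (n + 1)
    for m in range(2, n + 1):
        F[m] = p * F[(m + 1) // 2] + (1 - p) * F[m - 1] + 1
    return F[n]
```

For $p = 1$ the recursion gives exactly $\lceil \log n \rceil$, and for $p = 1/2$ doubling $n$ increases the value by roughly $1/p = 2$, in line with the $\log(n)/p + O(1)$ behavior.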

We refer to p as the success rate, and to 1 —p as the failure rate. Due to the sequential nature 
of the GP protocol, even a very small change in the failure pattern (that is, the set of failed 
nodes) may imply a large change in the time complexity. This makes a straightforward analysis 
of this model a bit tricky. To ease the analysis, we start by considering a similar protocol in a 
simpler model, the random wakeup model, which we believe to be of independent interest. A 
natural coupling argument allows us to transfer results from the random wakeup model to the 
standard random node failure model. 



3.1 The Random Wakeup Model 

We regard the following divide-and-conquer wake-up protocol, which is inspired by the GP
algorithm. The start processor starts with the list $(1, 2, \ldots, n - 1)$ of nodes to be woken up.
It does so by waking up processor 1 and forwarding the list $\mathrm{even}(2, \ldots, n - 1)$ to it, keeping for
itself the list $\mathrm{odd}(2, \ldots, n - 1)$ as its to-do list. The difference from the standard failure model is
that each wakeup request is successful with probability $p$, independently of previous requests.
Hence in the implied rumor spreading algorithm, to be denoted WU, whenever $u$ selects an edge
$(u,v)$, it repeatedly sends $(u,v)$ requests until $v$ is woken up. Informally, the time complexity
of the algorithm in this model is larger than in the standard initial-failures model, since in the
standard model only one request is sent to each processor (a formal proof of this is given in
Section 3.3.2). Note also that like the GP algorithm, the WU algorithm performs the minimum
possible number of communication requests in each execution: $n + f - 1$ requests when there
are $f$ failed wakeup messages.

The time complexity of the random wakeup model is easier to analyze since the implied
WU algorithm sends communication requests along a fixed set of edges, which is independent of
the specific failure pattern. For analyzing this time complexity we represent the WU algorithm
by a full binary tree $T$ with $n$ leaves, in which each vertex $x$ is labeled by a processor name
$L(x) \in \{0, \ldots, n - 1\}$ according to the following scheme (cf. Figure 1). The leaves of $T$ are
labeled by the processor names $0, \ldots, n - 1$, according to some arbitrary but fixed order. The
labeling of an internal vertex $x$ with children $y, z$ is $L(x) = \min\{L(y), L(z)\}$. Thus $L(r) = 0$
(where $r$ is the root of the tree), and for each processor $k$, the vertices of $T$ labeled by $k$ form
a directed path, $\mathrm{Path}_k$, ending at a leaf of $T$.

The algorithm for processor $k \in \{0, \ldots, n - 1\}$ implied by the above labeled tree $T$ is the
following: After receiving the rumor, $k$ moves along the vertices of $\mathrm{Path}_k$. When $k$ steps on
a non-leaf vertex $x \in \mathrm{Path}_k$ with children $y, z$, it repeatedly sends communication requests to
$j = \max\{L(y), L(z)\}$ until $j$ wakes up.

Consider now a specific execution $\mathcal{E}_{WU}$ of the above random wakeup algorithm. For each
internal vertex $x \in T$ with children $y, z$, let $k_x = L(x)$ and $j_x = \max\{L(y), L(z)\}$. Denote by
$\mathrm{Delay}(x)$ the number of $(k_x, j_x)$ requests sent by $k_x$ in $\mathcal{E}_{WU}$. Then $\mathrm{Delay}(x)$ is a geometric
random variable with success probability $p$, that is, $\Pr[\mathrm{Delay}(x) = k] = (1 - p)^{k-1} p$ for all positive
integers $k$, and $\mathrm{E}[\mathrm{Delay}(x)] = 1/p$.

Figure 1: Illustration of the rumor spreading in the random wakeup model for 5
processors: Each vertex of $T$ is labeled by a processor's name. The red bold edges indicate
rumor transfers. Thus processor 0 always transfers the rumor to processors 3, 2, and 1 (in this
order). The numbers in parentheses beneath internal vertices indicate the number of wakeup
calls in a specific execution. That is, in the depicted execution processor 1 woke up only by the
7th (0, 1) request. The time complexity of this execution is 3 + 2 + 7 = 12.



For a processor $j \in [0 \ldots n - 1]$, let $P_j$ be the path from the root $r$ of $T$ to the (unique)
leaf labeled by $j$, and let $\mathrm{Delay}(P_j) = \sum_{x \in P_j} \mathrm{Delay}(x)$. Then the time complexity of $\mathcal{E}_{WU}$ is
given by

$$\mathrm{time}(\mathcal{E}_{WU}) = \max_{j \in [0 \ldots n-1]} \{\mathrm{Delay}(P_j)\}.$$
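
The path delays can be sampled directly. The following sketch is only an illustration: it assumes the tree induced by the odd/even list splitting (one admissible choice of the leaf order), and the function name `wu_path_delays` is ours.

```python
import random


def wu_path_delays(n, p, rng):
    """Sample one execution of the wakeup protocol; return Delay(P_j) per leaf j.

    Assumes the tree induced by the odd/even list splitting and p > 0.
    """
    delays = {}

    def visit(k, todo, elapsed):
        if not todo:                       # leaf labeled k
            delays[k] = elapsed
            return
        j, rest = todo[0], todo[1:]
        d = 1                              # geometric number of (k, j) requests
        while rng.random() >= p:
            d += 1
        visit(k, rest[0::2], elapsed + d)  # k continues with odd(rest)
        visit(j, rest[1::2], elapsed + d)  # j starts with even(rest)

    visit(0, list(range(1, n)), 0)
    return delays


delays = wu_path_delays(5, 0.5, random.Random(0))
time_wu = max(delays.values())             # time complexity of this execution
```

With $p = 1$ every delay equals 1, so the returned values are just the numbers of internal vertices on the paths $P_j$.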

3.2 The Time Complexity of the Random Wakeup Model 

Theorem 5. Let $c > 1$ be a constant and let $p \in (0, 1)$ be arbitrary (possibly $p = 1 - o(1)$).
With probability at least $1 - n \exp\left(-\frac{(c-1)^2}{2c} (\lceil \log(n - 1) \rceil - 1)\right)$, the WU algorithm with success
rate $p$ has delivered the rumor to all processors after $\frac{c}{p} (\lceil \log(n - 1) \rceil + 1)$ rounds.

The success probability in Theorem 5 becomes $1 - o(1)$ for $c$ with $\frac{(c-1)^2}{2c} > \ln 2$, e.g., for $c = 4$.
The theorem follows essentially from the Chernoff bound for geometric random variables, cf.
Theorem 4.
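
The failure probability bound of Theorem 5 is easy to evaluate numerically. The sketch below uses our reading of the bound, $n \exp\left(-\frac{(c-1)^2}{2c} (\lceil \log(n - 1) \rceil - 1)\right)$; the function name is hypothetical.

```python
import math


def wu_failure_bound(n, c):
    """Evaluate n * exp(-((c-1)^2 / (2c)) * (ceil(log2(n-1)) - 1)) for n >= 2."""
    ceil_log = (n - 2).bit_length()        # equals ceil(log2(n - 1)) for n >= 2
    return n * math.exp(-((c - 1) ** 2 / (2 * c)) * (ceil_log - 1))
```

For $c = 4$ the bound decreases as $n$ grows, whereas for $c = 3$ it stays above 1 (for instance at $n = 2^{20}$) and is therefore vacuous.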

Proof of Theorem 5. By construction, for each processor $j \in [0 \ldots n - 1]$ the number of internal
vertices on $P_j$ is at least $\lceil \log(n - 1) \rceil$ and at most $\lceil \log(n - 1) \rceil + 1$. Therefore the expected delay of path
$P_j$, $\mathrm{E}[\mathrm{Delay}(P_j)]$, equals $(1/p) \lceil \log(n - 1) \rceil$ or $(1/p)(\lceil \log(n - 1) \rceil + 1)$, respectively. Since the
variables $\{\mathrm{Delay}(x) : x \in P_j\}$ are mutually independent, by Theorem 4 we have

$$\Pr\left[\mathrm{Delay}(P_j) \ge (c/p)(\lceil \log(n - 1) \rceil + 1)\right] \le \Pr\left[\mathrm{Delay}(P_j) \ge (1 + (c - 1))\,\mathrm{E}[\mathrm{Delay}(P_j)]\right] \le \exp\left(-\frac{(c-1)^2}{2c} (\lceil \log(n - 1) \rceil - 1)\right).$$

A simple union bound over all $n$ paths concludes the proof. □



3.3 Coupling the GP and WU Models 

To relate the time complexities of the random wakeup model and the GP algorithm in the
presence of random node failures, we embed the failure patterns of both models in the probability
space consisting of infinite binary vectors $\{b \mid b \in \{0, 1\}^{\mathbb{N}}\}$, representing infinite sequences of
binary i.i.d. random variables (infinite sequences are needed since the number of possible failures
in executions of the WU algorithm is unbounded). This embedding of failure patterns induces
distributions over executions of rumor spreading algorithms, similarly to the way randomized
algorithms are presented in the classical work of Yao [Yao77].

In Section 3.3.1 we define the mappings of infinite binary vectors to failure patterns, and
then to execution trees (whose heights represent the time complexity of the corresponding execution
of the GP and the WU protocol, respectively). In Section 3.3.2 we use this mapping to present
our coupling argument (Lemma 7).



3.3.1 Failure Patterns and Execution Trees 

Any execution of the GP or of the WU algorithm with a single start processor is determined by
the initial system configuration (in short, configuration). A configuration is a pair $(n, b)$, where
$n$ is the number of processors to which the rumor has to be delivered, and $b = (b_1, b_2, \ldots)$ is
an infinite binary vector representing a failure pattern. An entry $b_i = 0$ corresponds to a failed
request and $b_i = 1$ corresponds to a successful request. For each configuration $(n, b)$, $\mathcal{E}_{GP}(n, b)$
denotes the execution of the GP algorithm on $(n, b)$, and $\mathcal{E}_{WU}(n, b)$ denotes the execution of
the WU algorithm on $(n, b)$ ($\mathcal{E}_{GP}(n, b)$ is always determined by the first $n$ bits of $b$, while
$\mathcal{E}_{WU}(n, b)$ is usually determined by a longer prefix of $b$).

$\mathcal{E}_{GP}(n, b)$ is defined by the execution tree $T_{GP}(n, b)$ as follows. The vertices of $T_{GP}(n, b)$ are
configurations. The root of $T_{GP}(n, b)$ is the configuration $(n, b)$. If $b = 0c$ (for some infinite
binary vector $c$) then the first request sent by the execution failed. Hence in the next round
there is still only one informed processor, with $n - 1$ uninformed processors in its list. Thus the
only child of $(n, b)$ is $(n - 1, c)$. If $b = 1c$ then the first request is successful, and hence $(n, b)$
has a left child $(\lceil (n-1)/2 \rceil, \mathrm{odd}(c))$ and a right child $(\lfloor (n-1)/2 \rfloor, \mathrm{even}(c))$.

This rule applies to all vertices of the tree: For $k > 0$ and a binary (infinite) vector $c$, a vertex
$(k, 0c)$ in $T_{GP}$ is an internal vertex with one child, $(k - 1, c)$, and a vertex $(k, 1c)$ has a left
child $(\lceil (k-1)/2 \rceil, \mathrm{odd}(c))$ and a right child $(\lfloor (k-1)/2 \rfloor, \mathrm{even}(c))$. Vertices of the form $(0, c)$ are leaves.
Figure 2 illustrates the execution tree of the GP algorithm.

The execution $\mathcal{E}_{WU}(n, b)$ of the WU algorithm with initial configuration $(n, b)$ is described
by an execution graph $T_{WU}(n, b)$ in a similar manner, with one exception: the unique child of
a vertex of the form $(k, 0c)$ for $k > 0$ is $(k, c)$ (and not $(k - 1, c)$), reflecting the fact that in a
failed request the number of uninformed processors remains unchanged. Note that $T_{WU}(n, b)$
may not be a tree, since it may contain vertices (configurations) of the form $(k, \mathbf{0})$, where
$\mathbf{0}$ is the all-zeros vector, which have
one outgoing edge which is a self loop (corresponding to the event of an infinite sequence of failed
requests by a processor). It is not hard to see that $T_{WU}(n, b)$ has no other cycles and no other
directed infinite paths. Hence $T_{WU}(n, b)$ is a finite rooted tree or a finite rooted tree with self
loops added to some of its leaves. This latter case corresponds to executions in which some
processor has an infinite succession of failures.

Figure 2: $T_{GP}(4; 1001\ldots)$: This tree describes the execution $\mathcal{E}_{GP}(4; 1001\ldots)$ of the GP algorithm
for 5 processors and a failure pattern $(1001\ldots)$. Each vertex is a system configuration $(k; b)$,
where $k$ is the number of processors to which the rumor needs to be delivered, and $b$ is the
corresponding failure pattern.

The following observation is implied by the definitions of $T_{GP}$ and $T_{WU}$.

Observation 6. For each system configuration $(n, b)$ it holds that:

1. The time complexity of $\mathcal{E}_{GP}(n, b)$ is $\mathrm{height}(T_{GP}(n, b))$.

2. If $T_{WU}(n, b)$ contains a self loop, then the time complexity of $\mathcal{E}_{WU}(n, b)$ is infinite. The
time complexity of $\mathcal{E}_{WU}(n, b)$ is $\mathrm{height}(T_{WU}(n, b))$, otherwise.

3.3.2 Coupling the Models 

Here and in the remainder of the paper we abbreviate $h_{GP}(n,b) = \mathrm{height}(T_{GP}(n,b))$ and $h_{WU}(n,b) = \mathrm{height}(T_{WU}(n,b))$.

The main coupling argument is the following lemma, whose inductive proof makes use of the fact that both functions $h_{GP}$ and $h_{WU}$ are monotone increasing in their first argument (i.e., the number of processors to be informed).

Lemma 7. For each system configuration $(n,b)$ it holds that $h_{GP}(n,b) \le h_{WU}(n,b)$.

For proving Lemma 7 we first observe that both $h_{GP}$ and $h_{WU}$ are monotone increasing in the number of uninformed processors.

Lemma 8. Let $h \in \{h_{GP}, h_{WU}\}$. The function $h$ is monotone increasing in its first argument.

This lemma follows immediately from the observation that for all $b$ and $n$, $T_{GP}(n,b)$ is isomorphic to a proper subtree of $T_{GP}(n+1,b)$, and $T_{WU}(n,b)$ is isomorphic to a proper subgraph of $T_{WU}(n+1,b)$.

We are now ready to prove the main coupling argument, Lemma 7.

Proof of Lemma 7. For $n = 0$ and for all vectors $b \in \{0,1\}^{\mathbb{N}}$ we have

$$h_{GP}(0,b) = 0 = h_{WU}(0,b).$$

For the all-zeros vector $b = \mathbf{0}$ and for all $n > 0$ it holds that

$$h_{GP}(n,\mathbf{0}) = n < \infty = h_{WU}(n,\mathbf{0}).$$

We proceed by induction on $n$, assuming $b \ne \mathbf{0}$. Let $b_k$ be the first non-zero element in $b$ (for some $k \ge 1$). It follows that

$$h_{GP}(n,b) = k + \max\{h_{GP}(\lceil \tfrac{n-k}{2} \rceil, \mathrm{odd}(b_{k+1},\ldots)),\ h_{GP}(\lfloor \tfrac{n-k}{2} \rfloor, \mathrm{even}(b_{k+1},\ldots))\},$$

which, by induction hypothesis, can be bounded from above by

$$k + \max\{h_{WU}(\lceil \tfrac{n-k}{2} \rceil, \mathrm{odd}(b_{k+1},\ldots)),\ h_{WU}(\lfloor \tfrac{n-k}{2} \rfloor, \mathrm{even}(b_{k+1},\ldots))\},$$

which, by Lemma 8, is itself bounded from above by

$$k + \max\{h_{WU}(\lceil \tfrac{n-1}{2} \rceil, \mathrm{odd}(b_{k+1},\ldots)),\ h_{WU}(\lfloor \tfrac{n-1}{2} \rfloor, \mathrm{even}(b_{k+1},\ldots))\} = h_{WU}(n,b).$$

□
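
The recursive definitions of the two execution trees translate directly into code, which makes Lemma 7 easy to check on small instances. The sketch below is illustrative only. It assumes that odd(·) and even(·) select the entries in odd and even positions, and that a finite tuple stands for a failure pattern padded with infinitely many zeros. The function names `h_gp` and `h_wu` are ours and do not appear in the paper.

```python
import math
from itertools import product


def h_gp(n, b):
    """Height of the GP execution tree for n uninformed processors.

    b is a finite tuple of 0/1 entries (1 = non-faulty); positions beyond
    its end are treated as 0 (assumption of this sketch).
    """
    if n == 0:
        return 0
    if not b:
        return n  # all-zeros pattern: one failed request per round
    head, c = b[0], b[1:]
    if head == 0:
        # Failed request: GP drops the failed processor from the list.
        return 1 + h_gp(n - 1, c)
    # Successful request: the remaining n-1 processors are split in two.
    return 1 + max(h_gp(n // 2, c[0::2]), h_gp((n - 1) // 2, c[1::2]))


def h_wu(n, b):
    """Height of the WU execution graph; math.inf encodes a self loop."""
    if n == 0:
        return 0
    if not b:
        return math.inf  # infinitely many failed requests
    head, c = b[0], b[1:]
    if head == 0:
        # Failed request: the number of uninformed processors is unchanged.
        return 1 + h_wu(n, c)
    return 1 + max(h_wu(n // 2, c[0::2]), h_wu((n - 1) // 2, c[1::2]))


def lemma7_holds(max_n, length):
    """Exhaustively compare h_gp and h_wu on all patterns of a given length."""
    return all(
        h_gp(n, b) <= h_wu(n, b)
        for n in range(max_n + 1)
        for b in product((0, 1), repeat=length)
    )
```

Calling `lemma7_holds(6, 8)` compares the two heights on every pattern of length 8 for up to six uninformed processors. Note that `n // 2` equals $\lceil (n-1)/2 \rceil$ for integer `n`.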

Lemma 7 and Observation 6 show that, for any initial configuration $(n,b)$, the execution $\mathcal{E}_{GP}(n,b)$ of the GP algorithm is at least as fast as the execution $\mathcal{E}_{WU}(n,b)$ of the WU algorithm. This implies that for any probability distribution $D$ on $\{0,1\}^{\mathbb{N}}$, if $b$ is sampled from $D$ then $\Pr[h_{GP}(n,b) \le H] \ge \Pr[h_{WU}(n,b) \le H]$. By letting $D$ be the distribution on $\{0,1\}^{\mathbb{N}}$ as defined in Section 3.3 above, Theorem 5 easily implies the following.

Theorem 9. Let $c > 1$ be a constant. The execution time of the GP algorithm with success probability $p \in (0,1)$ is at most $\frac{c}{p}(\lceil \log(n-1) \rceil + 1)$, with probability at least

$$1 - n \exp\left(-\frac{(c-1)^2}{2c}(\lceil \log(n-1) \rceil - 1)\right).$$
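
To get a feeling for the constants in Theorem 9, the bound can be evaluated numerically. The helper below is a minimal sketch, not part of the paper: the name `theorem9_bound` is hypothetical, $\log$ is taken to be the binary logarithm, and the failure probability is capped at 1 because the bound is vacuous for small $c$.

```python
import math


def theorem9_bound(n, p, c):
    """Return (T, delta) for Theorem 9.

    T     = (c / p) * (ceil(log2(n - 1)) + 1) is the runtime threshold,
    delta = n * exp(-(c - 1)^2 / (2c) * (ceil(log2(n - 1)) - 1)) bounds the
            probability that the execution time exceeds T (capped at 1).
    """
    if n < 3 or not 0 < p < 1 or c <= 1:
        raise ValueError("need n >= 3, 0 < p < 1 and c > 1")
    levels = math.ceil(math.log2(n - 1))
    threshold = (c / p) * (levels + 1)
    delta = n * math.exp(-((c - 1) ** 2) / (2 * c) * (levels - 1))
    return threshold, min(1.0, delta)
```

For example, `theorem9_bound(2**20 + 1, 0.5, 8)` gives the threshold 336.0 together with a failure bound far below $10^{-15}$, whereas for `c = 2` the returned bound is the trivial value 1.0.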



4 Adversarial Failures in the Randomized GP-Protocol 

In this section we analyze adversarial failures. As mentioned in Lemma 2, it has been proven in [GP96] that the time complexity of the GP algorithm is at most $f + \lceil \log(n-f) \rceil$ when the number of failures is at most $f$. This bound is sharp when the first $f$ nodes fail. For $f = \omega(\log n)$, this bound is not satisfactory.

We give a modification of the GP algorithm which is fault-tolerant in the sense that no matter which constant fraction of the nodes fail, every processor receives a contact request within the first $O(\log n)$ rounds, with high probability. This protocol can best be described as a randomized version of the basic GP algorithm.

Our algorithm works as follows. When the rumor is injected at processor 0, this processor picks a permutation $\pi \in S_{n-1}$ uniformly at random. In round one it tries to contact processor $\pi(1)$. If this processor has a failure, processor 0 sends a communication request to processor $\pi(2)$ in round two. Otherwise, i.e., if processor $\pi(1)$ is not failed, processor 0 sends to it the rumor and appends to this rumor the list $\mathrm{even}(\pi(2),\ldots,\pi(n-1))$. Processor 0 keeps the list $\mathrm{odd}(\pi(2),\ldots,\pi(n-1))$. The protocol continues as described in Section 2.2. That is, all we have changed in our randomized version of the GP algorithm is to replace the list of processor 0, which is $(1,\ldots,n-1)$ in the original GP algorithm, by $(\pi(1),\ldots,\pi(n-1))$, where $\pi$ is a random permutation of $[n-1]$. We also have to append information on $\pi$ when transferring the rumor. It is not difficult to see (see Section 4.1 below) that this requires a total number of $O(n \log^2 n)$ bits that are appended to the rumors (compared to $O(n \log n)$ in the GP algorithm). The maximum length of an individual message appendix is $n$ bits.
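
The protocol just described can be simulated round by round. The following is a minimal sketch under stated assumptions: odd(·) and even(·) pick the entries in odd and even positions of a list, all contacts of a round happen simultaneously, and the names `run_gp` and `randomized_gp` are hypothetical helpers rather than anything defined in the paper.

```python
import random


def run_gp(order, failed):
    """Simulate the GP protocol when processor 0 starts with the list `order`.

    `failed` is the set of crashed processors. Returns the number of rounds
    until every list is empty, and the set of informed processors.
    """
    lists = {0: list(order)}  # task list currently held by each informed node
    informed = {0}
    rounds = 0
    while any(lists.values()):
        rounds += 1
        next_lists = {}
        for proc, todo in lists.items():
            if not todo:
                continue  # this processor has finished its work
            target, rest = todo[0], todo[1:]
            if target in failed:
                next_lists[proc] = rest  # failed request: only drop the head
            else:
                informed.add(target)
                next_lists[proc] = rest[0::2]    # sender keeps odd(rest)
                next_lists[target] = rest[1::2]  # receiver gets even(rest)
        lists = next_lists
    return rounds, informed


def randomized_gp(n, failed, rng):
    """Run GP on a uniformly random permutation of the labels 1..n-1."""
    order = list(range(1, n))
    rng.shuffle(order)
    return run_gp(order, failed)
```

With the identity order on eight processors and no failures, `run_gp(list(range(1, 8)), set())` finishes in 3 rounds. If the first two list entries fail, `run_gp(list(range(1, 8)), {1, 2})` needs 5 rounds, matching the deterministic bound $f + \lceil \log(n-f) \rceil$ for $n = 8$, $f = 2$.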

Here and in the remainder of this section we assume (as in all other parts of this work) that 
the processor initially holding the rumor, node 0, does not fail. Recall that in our initial node 
failure model, a processor either is a failed one or it does work throughout the execution. 

Before we analyze the time complexity of the randomized GP algorithm, let us briefly discuss its bit complexity, i.e., the number of bits needed to encode the lists that are appended to the initial rumor.



4.1 The Bit Complexity of the Randomized GP Algorithm 

In a naive implementation of the randomized GP algorithm, every processor passes to its neighbor the list of nodes to be informed by that processor. As described above, in such an implementation, node 0 would pass to node $\pi(1)$ the list $\mathrm{even}(\pi(2),\ldots,\pi(n-1))$ of length smaller than $n/2$. This requires $O(n \log n)$ bits to be appended to the initial rumor. Since the length of the list halves with each successful communication request, in every level of the execution tree the total number of bits that need to be communicated is $O(n \log n)$: We say that a processor state is on the $t$th level if on its unique path to the root exactly $t$ successful calls have happened. For $t \le \log(n-1)$, in the $t$th level, there are at most $2^t$ informed processors, all of which send the rumor to their descendants. Each such processor needs to append a list of length at most $n/2^{t+1}$. This makes a total number of $O(n \log n)$ additional bits that need to be communicated on the $t$th level. Since there are $O(\log n)$ levels in total, the total bit complexity of this implementation is $O(n \log^2 n)$.
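
The level-by-level count can be written out as a single sum. This is only a sketch of the argument above. It assumes that each list entry (a processor label) is encoded with $\lceil \log n \rceil$ bits and that there are at most $\lceil \log(n-1) \rceil + 1$ levels; both choices are illustrative and only affect constants.

```latex
\sum_{t=0}^{\lceil \log(n-1) \rceil}
  \underbrace{2^{t}}_{\text{senders on level } t}
  \cdot \underbrace{\frac{n}{2^{t+1}}}_{\text{max. list length}}
  \cdot \underbrace{\lceil \log n \rceil}_{\text{bits per label}}
  \;=\; \frac{n \lceil \log n \rceil}{2}\,\bigl(\lceil \log(n-1) \rceil + 1\bigr)
  \;=\; O(n \log^{2} n).
```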

Another implementation of the randomized GP algorithm with the same (asymptotic) bit complexity but a smaller maximal appendix is the following. If a processor needs to communicate to its neighbor a list $L = (j_1,\ldots,j_k)$ of length $k \ge n/\log n$, it appends to the rumor the incidence list of $L$; i.e., a 0/1 vector $x$ of length $n-1$ with $x_i = 1$ if $i \in \{j_1,\ldots,j_k\}$ and $x_i = 0$ otherwise. If a processor has received such a rumor with appended task list $x$, it creates a random permutation $\pi_x$ of the indices $\{i \mid x_i = 1\}$. It then proceeds as usual, trying to spread the rumor to processor $\pi_x(1)$ in the next round. If fewer than $n/\log n$ indices need to be communicated, it is cheaper to pass the list itself. It is easily verified, using similar arguments as above, that this implementation yields a total bit complexity of $O(n \log^2 n)$. However, the length of the longest appendix is just linear in $n$.

In Section 5 we describe an alternative algorithm in which the maximal size of a message appendix is in the order of $\log n$ bits.



4.2 The Time Complexity of the Randomized GP Algorithm 

For bounding the time complexity of the randomized GP algorithm we first show that $h_{GP}(n,b)$, the time complexity of this algorithm for given $n$ and $b$, is monotone decreasing in the failure pattern $b$, according to the following natural partial order on binary sequences: $(b_1,b_2,\ldots) \le (c_1,c_2,\ldots)$ if and only if for all $i \in \mathbb{N}$ we have $b_i \le c_i$.

Lemma 10. The function $h_{GP}(\cdot,\cdot) : \mathbb{N}_0 \times \{0,1\}^{\mathbb{N}} \to \mathbb{R}$ is monotone decreasing in its second argument. That is, for any failure pattern $b$, replacing failed processors by non-faulty ones cannot increase the time complexity of the execution.

The proof of Lemma 10 uses the following statement, which, informally, says that for each possible failure pattern $b$, splitting the rumor spreading at the very beginning between two processors cannot increase the time complexity of the execution.

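Lemma 11. For all $n \in \mathbb{N}_0$ and all $b \in \{0,1\}^{\mathbb{N}}$ it holds that

$$h_{GP}(n,b) \ge \max\{h_{GP}(\lceil \tfrac{n}{2} \rceil, \mathrm{odd}(b)),\ h_{GP}(\lfloor \tfrac{n}{2} \rfloor, \mathrm{even}(b))\}. \qquad (4)$$

Proof of Lemma 11. The proof is by induction on $n$. The lemma clearly holds for $n = 0$ and $n = 1$. So let $n \ge 2$. Assume first that $b = 0c$. Then by the definition of $T_{GP}$,

$$h_{GP}(n,0c) = 1 + h_{GP}(n-1,c).$$

Using the identities $\lceil \tfrac{n}{2} \rceil - 1 = \lfloor \tfrac{n-1}{2} \rfloor$ and $\lfloor \tfrac{n}{2} \rfloor = \lceil \tfrac{n-1}{2} \rceil$, we also have

$$h_{GP}(\lceil \tfrac{n}{2} \rceil, \mathrm{odd}(0c)) = 1 + h_{GP}(\lfloor \tfrac{n-1}{2} \rfloor, \mathrm{even}(c)) \quad\text{and}\quad h_{GP}(\lfloor \tfrac{n}{2} \rfloor, \mathrm{even}(0c)) = h_{GP}(\lceil \tfrac{n-1}{2} \rceil, \mathrm{odd}(c)),$$

which implies (4) by induction.

The case $b = 1c$ follows along the same lines. To simplify the notation for this case, define $n_1 = \lceil \tfrac{n-1}{2} \rceil$, $n_2 = \lfloor \tfrac{n-1}{2} \rfloor$, $d = \mathrm{odd}(c)$, and $e = \mathrm{even}(c)$. Then by the definition of $T_{GP}$,

$$h_{GP}(n,1c) = 1 + \max\{h_{GP}(n_1,d),\ h_{GP}(n_2,e)\}.$$

The inductive step follows from the inequalities

$$h_{GP}(\lceil \tfrac{n}{2} \rceil, \mathrm{odd}(1c)) = 1 + \max\{h_{GP}(\lceil \tfrac{n_2}{2} \rceil, \mathrm{odd}(e)),\ h_{GP}(\lfloor \tfrac{n_2}{2} \rfloor, \mathrm{even}(e))\} \le 1 + h_{GP}(n_2,e) \quad\text{(by induction hypothesis)},$$

and $h_{GP}(n_1, \mathrm{even}(1c)) = h_{GP}(n_1,d)$, where $n_1 = \lfloor \tfrac{n}{2} \rfloor$. □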



Proof of Lemma 10. The proof is by induction on $n$. For $n = 0$ we have that $h_{GP}(0,b) = 0$ for all $b$, and the lemma trivially holds. For the induction step, let $n \ge 0$ and let $b$, $c$ be two vectors such that $b \le c$. Then $b = b_1 d$ and $c = c_1 e$, where $b_1 \le c_1$ and $d \le e$. If $b_1 = c_1 = 0$ then $b = 0d$, $c = 0e$ and the induction step holds by

$$h_{GP}(n+1, 0d) = 1 + h_{GP}(n,d) \ge 1 + h_{GP}(n,e) = h_{GP}(n+1, 0e),$$

where the inequality follows from the induction hypothesis. The case $b_1 = c_1 = 1$ is similar and omitted. So we are left with the case $b = 0d$, $c = 1e$ with $d \le e$. In this case we have

$$\begin{aligned}
h_{GP}(n+1, 0d) &= 1 + h_{GP}(n,d) \\
&\ge 1 + \max\{h_{GP}(\lceil \tfrac{n}{2} \rceil, \mathrm{odd}(d)),\ h_{GP}(\lfloor \tfrac{n}{2} \rfloor, \mathrm{even}(d))\} \\
&\ge 1 + \max\{h_{GP}(\lceil \tfrac{n}{2} \rceil, \mathrm{odd}(e)),\ h_{GP}(\lfloor \tfrac{n}{2} \rfloor, \mathrm{even}(e))\} \\
&= h_{GP}(n+1, 1e),
\end{aligned}$$

where the first inequality follows from Lemma 11 and the latter inequality follows from the induction hypothesis applied to $\mathrm{odd}(d)$, $\mathrm{odd}(e)$ and to $\mathrm{even}(d)$, $\mathrm{even}(e)$. □

Note: A similar but slightly more involved argument shows that Lemma 10 holds also for the function $h_{WU}(\cdot,\cdot)$.



Theorem 12. Let $n \in \mathbb{N}$. Let $\varepsilon = \sqrt{\ln n/(n-1)}$. Let $f < (n-1)(1-\varepsilon)$ and let $F \subseteq [n-1]$ be of size $|F| = f$. Let $p = 1 - \frac{f}{n-1}$. Let $c > 1$ be a constant.

The probability that the randomized version of the GP algorithm has time complexity $T \le \frac{c}{p-\varepsilon}(\lceil \log(n-1) \rceil + 1)$ is at least $1 - \frac{n^3}{n^2-1} \exp\left(-\frac{(c-1)^2}{2c}(\lceil \log(n-1) \rceil - 1)\right)$, even if all processors in $F$ fail.

As we mentioned after Theorem 5, the probability bound is $1 - o(1)$ for $c$ satisfying $\frac{(c-1)^2}{2c\ln(2)} > 1$. The proof of Theorem 12 is via a reduction to the random failure model analyzed in Section 3.1. It makes use of several Chernoff bounds and the monotonicity proven in Lemma 10. We basically show that the runtime of the randomized protocol is not worse than that of the basic deterministic GP algorithm in the presence of independent random failures. In the latter, we choose the failure probability to be slightly larger than the "fair" ratio $f/(n-1)$, so that, with probability at least $1 - n^{-2}$, at least $f$ nodes are crashed. Combining this with the resulting runtime bound from Theorem 9 proves Theorem 12.



Proof of Theorem 12. It is easy to verify that the randomized GP algorithm with $f$ adversarial failures has the same performance as the original GP algorithm when a random subset of node failures, $R \subseteq [n-1]$ with $|R| = f$, is selected uniformly. We analyze the latter.

Let $p' := p - \varepsilon$ and $T := \frac{c}{p'}(\lceil \log(n-1) \rceil + 1)$. Let $b \in \{0,1\}^{n-1}$ with $|b|_0 = f$ be chosen uniformly at random, where $|b|_0$ denotes the number of zeros in $b$. We need to show that

$$\Pr[h_{GP}(n,b) > T] \le \frac{n^3}{n^2-1} \exp\left(-\frac{(c-1)^2}{2c}(\lceil \log(n-1) \rceil - 1)\right).$$

Let $c \in \{0,1\}^{n-1}$ be such that $\Pr[c_i = 1] = p'$ independently for all $i \in [n-1]$. That is, the probability that $c_i = 0$ is $\frac{f}{n-1} + \varepsilon$, for every $i \in [n-1]$. We show

$$\Pr[|c|_0 \ge f] \ge 1 - n^{-2}, \qquad (5)$$

which can be easily verified by Chernoff's bound: The expected value of $|c|_0$ is $f + \varepsilon(n-1)$. By Chernoff's bound, cf. Theorem 3, equation (1), we have

$$\Pr[|c|_0 < f] = \Pr[|c|_0 < E[|c|_0] - \varepsilon(n-1)] \le \exp\left(-2(\varepsilon(n-1))^2/(n-1)\right) = \exp(-2\varepsilon^2(n-1)) = \exp(-2\ln n) = n^{-2}.$$

Next we argue that

$$\Pr[h_{GP}(n,b) > T] \le \Pr[h_{GP}(n,c) > T \mid |c|_0 \ge f]. \qquad (6)$$

To verify (6), assume $|c|_0 \ge f$. Sample $k := |c|_0 - f$ indices $i_1,\ldots,i_k$ from the 0-positions $\{i \in [n-1] \mid c_i = 0\}$ of $c$ uniformly at random. Create $d$ from $c$ by replacing the zeros in positions $i_1,\ldots,i_k$ by ones. Then $d$ is uniform in the set $\{b \in \{0,1\}^{n-1} \mid |b|_0 = f\}$, as is $b$. Inequality (6) follows from the latter and the monotonicity of $h_{GP}(n,b)$ in $b$, as stated in Lemma 10.

Using this inequality we bound

$$\begin{aligned}
\Pr[h_{GP}(n,b) > T] \cdot \Pr[|c|_0 \ge f] &\le \Pr[h_{GP}(n,c) > T \mid |c|_0 \ge f] \cdot \Pr[|c|_0 \ge f] \\
&\le \Pr[h_{GP}(n,c) > T \mid |c|_0 \ge f] \cdot \Pr[|c|_0 \ge f] + \Pr[h_{GP}(n,c) > T \mid |c|_0 < f] \cdot \Pr[|c|_0 < f] \\
&= \Pr[h_{GP}(n,c) > T].
\end{aligned}$$

The latter quantity can be bounded by Theorem 9. It shows that the time complexity of the GP algorithm with success rate $p'$ satisfies

$$\Pr[h_{GP}(n,c) > T] \le n \exp\left(-\frac{(c-1)^2}{2c}(\lceil \log(n-1) \rceil - 1)\right).$$

Together with inequality (5), this concludes the proof. □

5 Reducing the Message Size 

Building on the results from the previous sections, we now describe an alternative version of the 
GP algorithm that has a message overhead of only O(logn) bits per rumor transfer, but that is 
still robust against adversarial failures. This is achieved at the price of adding a preprocessing 
phase and extra storage space to each of the processors. 

More precisely, we show that for $t \in \omega(n/\log n)$, there are $t$ permutations such that, no matter which constant fraction of the processors fail, the probability that a permutation chosen uniformly at random out of the $t$ yields a runtime that is greater than $c \log n$ is $o(1)$ (both the constant $c$ and the $o(1)$ failure probability will be made precise below). The algorithm is based on storing these $t$ permutations at each of the processors.

Let $\{\pi^1,\ldots,\pi^t\} \subseteq S_{n-1}$ be the stored permutations. Upon receiving a rumor, processor 0 chooses at random an index $r \in [t]$. The algorithm now is the following minor modification of the original GP algorithm: At each round, a processor $i$ which holds a nonempty list $(j_1,\ldots,j_k)$ sends a communication request to processor $\pi^r(j_1)$, and deletes $j_1$ from its list. If $\pi^r(j_1)$ is non-faulty, then $i$ also sends it the rumor appended with (a) the index $r$, (b) the value $j_3$, (c) the length $\lfloor \tfrac{k-1}{2} \rfloor$ of the list to be informed by processor $\pi^r(j_1)$, and (d) the exponent $m$ of the arithmetic progression $\mathrm{even}(j_2,\ldots,j_k)$ (its common difference is $2^m$). Processor $\pi^r(j_1)$ starts the next round with the list $\mathrm{even}(j_2,\ldots,j_k)$, and processor $i$ starts it with the list $\mathrm{odd}(j_2,\ldots,j_k)$.

To pass information (b)-(d), $3(\lceil \log n \rceil + 1)$ bits suffice. To pass information (a), $\lceil \log t \rceil + 1$ bits are needed. Thus, for $t \in O(n^d)$ for a constant $d$, the overall number of bits that need to be appended to the rumor is $O(\log n)$.
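
The reason a constant number of integers suffices is that every task list arising in the protocol is an arithmetic progression of indices. The sketch below illustrates this under our own assumptions: a list is stored as a triple `(start, length, m)` standing for `start, start + 2**m, ...`, the stored permutations are given as Python lists with `perms[r][j - 1]` playing the role of $\pi^r(j)$, and the names `split` and `run_gp_compact` are hypothetical.

```python
def split(task):
    """Remove the head of a progression and split the rest into odd/even parts.

    task = (start, length, m) encodes the indices start, start + 2**m, ...
    Returns (head, odd_part, even_part), both parts with exponent m + 1.
    """
    start, length, m = task
    step = 1 << m
    rest_len = length - 1
    odd_part = (start + step, (rest_len + 1) // 2, m + 1)
    even_part = (start + 2 * step, rest_len // 2, m + 1)
    return start, odd_part, even_part


def run_gp_compact(n, perms, r, failed):
    """Simulate the protocol along permutation perms[r] with crashed set `failed`."""
    perm = perms[r]
    tasks = {0: (1, n - 1, 0)}  # processor 0 starts with indices 1..n-1
    informed = {0}
    rounds = 0
    while any(length > 0 for _, length, _ in tasks.values()):
        rounds += 1
        next_tasks = {}
        for proc, task in tasks.items():
            start, length, m = task
            if length == 0:
                continue
            head, odd_part, even_part = split(task)
            target = perm[head - 1]
            if target in failed:
                # Only the head is deleted; the exponent stays the same.
                next_tasks[proc] = (start + (1 << m), length - 1, m)
            else:
                informed.add(target)
                next_tasks[proc] = odd_part
                # The message to `target` carries r and the triple even_part.
                next_tasks[target] = even_part
        tasks = next_tasks
    return rounds, informed
```

With a single stored identity permutation on eight processors, `run_gp_compact(8, [list(range(1, 8))], 0, set())` informs all processors in 3 rounds, and each message consists of the index `r` plus three integers of size at most `n`.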



As mentioned above, the main goal of this section is to show (Theorem 14) that for $t \in \omega(n/\log n)$ and suitably chosen permutations $\pi^1,\ldots,\pi^t$ this protocol, with high probability, is robust against adversarial failures.

Definition 13. We call the GP($\pi^1,\ldots,\pi^t$) algorithm described above $(f,r,T)$-safe if, for each possible choice of failure pattern $F \subseteq [n-1]$ with $|F| = f$, it holds that with probability at least $r$ the runtime of the protocol GP($\pi^1,\ldots,\pi^t$) with failure pattern $F$ is at most $T$.²

Interestingly, for any constant $d < 1$ and for $t \in \omega(n/\log n)$, $t$ randomly chosen permutations $\pi^1,\ldots,\pi^t$ are $(dn, 1-o(1), O(\log n))$-safe, with high probability.

Theorem 14. Let $t \in \omega(n/\log n)$. Let $\pi^1,\ldots,\pi^t$ be taken from $S_{n-1}$ independently and uniformly at random. Let $\varepsilon := \sqrt{\ln n/(n-1)}$, $f < (n-1)(1-\varepsilon)$, and $p := 1 - \frac{f}{n-1}$.

There are $c = c(n) \le 6 + o(1)$ and $\delta = \delta(n)$ with $\lim_{n\to\infty} \delta(n) = 0$, such that the probability that GP($\pi^1,\ldots,\pi^t$) is $(f, 1-\delta, \frac{c}{p-\varepsilon}(\lceil \log(n-1) \rceil + 1))$-safe is $1 - o(n^{-1})$.



The proof of Theorem 14 is based on Theorem 12. By that theorem we know that, for a fixed failure set $F$ and a random permutation $\pi$, the probability that the randomized GP algorithm along permutation $\pi$ and failure set $F$ exceeds the desired runtime $T$ is less than $n^{-c}$. Based on this, we show that the fraction of the $\omega(n/\log n)$ randomly chosen permutations that exceed runtime $T$ is at least $\delta$ only with probability exponentially small in $n$. A union bound over all possible failure patterns $F$ concludes the proof.

² The probability statement in this definition is with respect to the random choice of the permutation index



Proof of Theorem 14. By the assumption on $t$ we have a function $\delta = \delta(n)$ satisfying $\delta \cdot t \ge 2n/\log(n)$ and $\lim_{n\to\infty} \delta(n) = 0$. Select such $\delta$ satisfying also $\delta(n) \ge e/n$.

We now define $c = c(n) > 1$. Fix, for the moment, some failure set $F \subseteq [n-1]$ of size $|F| = f$. For given $\pi^1,\ldots,\pi^t$, let $T^i$ be the time complexity of the GP algorithm along permutation $\pi^i$ if all processors in $F$ fail. Since the index $i$ is chosen uniformly, the probability that the runtime of the GP algorithm with failure set $F$ exceeds

$$T := \frac{c}{p-\varepsilon}(\lceil \log(n-1) \rceil + 1)$$

is the fraction of indices $i \in [t]$ with $T^i > T$. By Theorem 12 we know that, for a random permutation $\sigma$ of $[n-1]$, the probability that the runtime of the GP algorithm along permutation $\sigma$ and failure set $F$ exceeds $T$ is at most

$$q := \frac{n^3}{n^2-1} \exp\left(-\frac{(c-1)^2}{2c}(\lceil \log(n-1) \rceil - 1)\right) = n \cdot \exp\left(-\frac{(c-1)^2}{2c} \log n \,(1+o(1))\right).$$

We select $c = c(n)$ such that $q \le n^{-2}$. Using the fact that $\log n = \ln(n)/\ln(2) \approx 1.44 \ln(n)$, it can be easily verified that $c$ can be chosen such that $c \le 6 + o(1)$ as claimed.
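
The constant can be checked by a short calculation. The following is a sketch that drops the $(1+o(1))$ factors and assumes, as stated in the text, that $\log$ denotes the binary logarithm.

```latex
\begin{align*}
n \cdot \exp\!\Bigl(-\tfrac{(c-1)^2}{2c}\,\log n\Bigr)
  = n^{\,1-\frac{(c-1)^2}{2c\ln 2}} \le n^{-2}
  &\iff \frac{(c-1)^2}{2c\ln 2} \ge 3
   \iff c^2 - (2 + 6\ln 2)\,c + 1 \ge 0, \\
\text{which for } c > 1 \text{ holds if and only if}\quad
  c &\ge (1 + 3\ln 2) + \sqrt{(1 + 3\ln 2)^2 - 1} \approx 5.99 .
\end{align*}
```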

We now show that for this value of $c$, the probability that the fraction of indices $i$ with $T^i > T$ is larger than $\delta$ is exponentially small. To obtain the statement of the theorem, we will then do a union bound over all possible choices of $F$.

The probability that for the fixed $F$ and randomly chosen permutations $\sigma^1,\ldots,\sigma^t$ at least $t_0$ of them yield a runtime exceeding $T$ is at most

$$\sum_{j=t_0}^{t} \binom{t}{j} q^j (1-q)^{t-j} \le \sum_{j=t_0}^{t} \binom{t}{j} q^j \le \sum_{j=t_0}^{t} \left(\frac{etq}{j}\right)^{j}, \qquad (7)$$

where the right inequality is by Stirling's approximation for $j!$.

We now set $t_0 := \lceil \delta t \rceil$, which, by definition of $\delta$, is at least $2n/\log(n)$. By the definition of $\delta$ and $c$, the sum above is dominated by the sum of the geometric progression $(a^{t_0}, a^{t_0+1}, \ldots, a^{t})$ for $a \ge eqt/t_0 \approx eq/\delta$. Since $\delta \ge e/n$, the value of $a$ can be chosen to be less than $1/n$, for sufficiently large $n$. Thus, the first element in this progression, $a^{t_0}$, is smaller than $n^{-2n/\log(n)} = 2^{-2n}$. Hence, the sum in (7) is smaller than $2 \cdot 2^{-2n}$.

We now do a union bound over all possible choices of $F$. There are at most $\binom{n}{f} \le 2^n$ different choices. Therefore, the probability that for $t$ randomly chosen permutations there exists a choice of $F$ such that the corresponding runtime is larger than $T$, is smaller than $2^n \cdot 2 \cdot 2^{-2n} = 2 \cdot 2^{-n}$. □

Note that the definition of $\delta$ in the above proof implies that if $t(n) \ge 2n^2/\log n$ then $\delta$ in Theorem 14 can be set to $\delta(n) = e/n$.

It remains a major challenge to make our construction explicit; i.e., to construct a set of $O(n/\log n)$ permutations which do not allow the adversary a choice of $f$ failures such that, for a random one of the permutations, the resulting runtime is $\omega(\log n)$ with constant probability. Note that such an explicit construction would remove the need for the preprocessing phase (needed to generate and distribute the random permutations among the processors). This is particularly appealing in environments where processors may be added to or removed from the network.

6 Conclusions and Future Work



We studied fault-tolerant rumor spreading algorithms in the initial failures model, where, for arbitrary $p \in (0,1)$, an arbitrary set of $pn$ processors may fail. The algorithms are based on introducing randomization to the message-optimal deterministic algorithm of [GP96]. They have minimal message complexity and asymptotically optimal time complexity, do not require activation of uninformed processors, and maintain the whispering property: in each round, the set of edges along which processors communicate forms a matching.

We proved that the time complexity of the GP algorithm in the presence of random initial failures is asymptotically optimal, i.e., $O(\log n)$. The analysis is based on a new random makeup model and a novel coupling technique, which could be of independent interest. To deal with adversarial failures, we have proposed a randomized version of the GP algorithm. While the randomized GP algorithm achieves best possible message complexity and asymptotically optimal time complexity, it has an overhead of a total number of $O(n \log^2 n)$ bits that need to be communicated. We have shown that, using a preprocessing step for storing $\omega(n/\log n)$ random permutations at the processors, we can decrease this overhead to $O(n \log n)$ bits. In this protocol, the maximum number of bits that need to be appended to any message is in the order of $\log n$.

Our construction uses probabilistic arguments and is not constructive. It remains a challenging question to give an explicit selection of $\omega(n/\log n)$ permutations such that the resulting time complexity of the rumor spreading protocol is, with high probability, logarithmic in the number of processors.